This notebook is the final step in a series of notebooks for doing machine learning in the cloud. The previous notebook demonstrated evaluating a model. In a real-world scenario, there are likely to be multiple evaluation datasets, as well as multiple models to evaluate, before there is a model suitable for deployment.
In [1]:
import google.datalab as datalab
import google.datalab.ml as ml
import mltoolbox.regression.dnn as regression
import os
import requests
import time
The storage bucket was created earlier. We'll re-declare it here so we can use it.
In [2]:
storage_bucket = 'gs://' + datalab.Context.default().project_id + '-datalab-workspace/'
storage_region = 'us-central1'
workspace_path = os.path.join(storage_bucket, 'census')
training_path = os.path.join(workspace_path, 'training')
model_name = 'census'
model_version = 'v1'
In [3]:
!gsutil ls -r {training_path}/model
Cloud Machine Learning Engine provides APIs to deploy and manage models. The first step is to create a named model resource, which can then be referred to by name. The second step is to deploy the trained model binaries as a version within that model resource.
NOTE: These steps can take a few minutes.
In [8]:
!gcloud ml-engine models create {model_name} --regions {storage_region}
In [9]:
!gcloud ml-engine versions create {model_version} --model {model_name} --origin {training_path}/model
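Once the version is created, the deployment can be verified with gcloud. This is an optional check, not part of the original walkthrough; depending on the gcloud release, the output includes a state field that reads READY once the version is usable.
In [ ]:
# Optional: describe the deployed version to confirm it is ready to serve.
!gcloud ml-engine versions describe {model_version} --model {model_name}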
At this point the model is ready for batch prediction jobs. It is also automatically exposed as an HTTP endpoint for performing online prediction.
Online prediction is accomplished by issuing HTTP requests to the specific model version endpoint. Instances to be predicted are formatted as JSON in the request body. The structure of instances depends on the model. The census model in this sample was trained using data formatted as CSV, so it expects inputs as CSV-formatted strings.
Prediction results are returned as JSON in the response.
HTTP requests must contain an OAuth token in the Authorization header to succeed. In the Datalab notebook, the OAuth token corresponding to the environment is accessible without requiring an OAuth flow. Actual applications will need to determine the best strategy for acquiring OAuth tokens, generally using Application Default Credentials.
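As a minimal sketch of that approach, assuming the google-auth library is installed (it is not otherwise imported in this notebook):
In [ ]:
# Minimal sketch: mint an OAuth access token from Application Default
# Credentials using the google-auth library (assumed installed).
import google.auth
from google.auth.transport.requests import Request

credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
credentials.refresh(Request())  # populates credentials.token
access_token = credentials.token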
In [10]:
api = 'https://ml.googleapis.com/v1/projects/{project}/models/{model}/versions/{version}:predict'
url = api.format(project=datalab.Context.default().project_id,
                 model=model_name,
                 version=model_version)
headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer ' + datalab.Context.default().credentials.get_access_token().access_token
}
body = {
  'instances': [
    '490,64,2,0,1,0,2,8090,015,01,1,00590,00500,1,18,0,2,1',
    '1225,32,5,0,4,5301,2,9680,015,01,1,00100,00100,1,21,2,1,1',
    '1226,30,1,0,1,0,2,8680,020,01,1,00100,00100,1,16,0,2,1'
  ]
}
response = requests.post(url, json=body, headers=headers)
predictions = response.json()['predictions']
predictions
Out[10]:
It is quite simple to issue these requests using your HTTP library of choice. Actual applications should include logic to handle errors, including retries.
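As one illustration, a simple wrapper with exponential backoff might look like the following sketch; the retryable status codes and backoff schedule are assumptions, not recommendations from this sample.
In [ ]:
# Sketch: retry online prediction with exponential backoff. The set of
# retryable status codes and the delays below are illustrative assumptions.
def predict_with_retries(url, body, headers, max_attempts=3):
  for attempt in range(max_attempts):
    response = requests.post(url, json=body, headers=headers)
    if response.status_code == 200:
      return response.json()['predictions']
    if response.status_code not in (429, 500, 503):
      break  # non-retryable error; give up immediately
    time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
  response.raise_for_status()

predict_with_retries(url, body, headers)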
While online prediction is optimized for low-latency requests over small lists of instances, batch prediction is designed for high-throughput prediction for large datasets. The same model can be used for both.
Batch prediction jobs can be submitted via the API, or more easily via the gcloud tool, which is what this notebook does below (an API sketch follows the gcloud command).
In [11]:
%%file /tmp/instances.csv
490,64,2,0,1,0,2,8090,015,01,1,00590,00500,1,18,0,2,1
1225,32,5,0,4,5301,2,9680,015,01,1,00100,00100,1,21,2,1,1
1226,30,1,0,1,0,2,8680,020,01,1,00100,00100,1,16,0,2,1
In [12]:
prediction_data_path = os.path.join(workspace_path, 'data/prediction.csv')
In [13]:
!gsutil -q cp /tmp/instances.csv {prediction_data_path}
Each batch prediction job must have a unique name within the scope of a project. The name below appends a timestamp so that it stays unique if you re-run this notebook.
In [14]:
job_name = 'census_prediction_' + str(int(time.time()))
prediction_path = os.path.join(workspace_path, 'predictions')
NOTE: A batch prediction job can take a few minutes to start because of the overhead of provisioning resources. That overhead is reasonable for large jobs, but it can far exceed the time needed to process a tiny dataset such as the one used in this sample.
In [15]:
!gcloud ml-engine jobs submit prediction {job_name} --model {model_name} --version {model_version} --data-format TEXT --input-paths {prediction_data_path} --output-path {prediction_path} --region {storage_region}
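For reference, a job with the same shape can also be created through the Cloud ML Engine v1 REST API. The following is a sketch only, assuming the google-api-python-client library is installed; it duplicates what the gcloud command above already did, and the output path used here is a hypothetical one chosen to avoid colliding with that job's output.
In [ ]:
# Sketch: submit a batch prediction job via the REST API using
# google-api-python-client (assumed installed). This mirrors the gcloud
# command above and is not needed if that command succeeded.
from googleapiclient import discovery

project_id = datalab.Context.default().project_id
job_spec = {
  'jobId': 'census_prediction_api_' + str(int(time.time())),
  'predictionInput': {
    'versionName': 'projects/{}/models/{}/versions/{}'.format(
        project_id, model_name, model_version),
    'dataFormat': 'TEXT',
    'inputPaths': [prediction_data_path],
    # A separate, hypothetical output path so results do not collide
    # with the gcloud-submitted job's output.
    'outputPath': os.path.join(workspace_path, 'predictions_api'),
    'region': storage_region
  }
}
ml_service = discovery.build('ml', 'v1')
ml_service.projects().jobs().create(
    parent='projects/' + project_id, body=job_spec).execute()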
The status of the job can be inspected in the Cloud Console. Once it completes, the outputs should be visible in the specified output path.
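It can also be polled directly from the notebook:
In [ ]:
# Optional: check the job state (QUEUED, RUNNING, SUCCEEDED, ...) from here.
!gcloud ml-engine jobs describe {job_name}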
In [16]:
!gsutil ls {prediction_path}
In [17]:
!gsutil cat {prediction_path}/prediction*
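The results can also be loaded back into Python. The sketch below assumes the service's usual newline-delimited JSON output format; the wildcard matches however many shards the job produced.
In [ ]:
# Sketch: read the sharded prediction output (assumed to be
# newline-delimited JSON, one result per line) into Python objects.
import json

result_lines = !gsutil cat {prediction_path}/prediction*
batch_predictions = [json.loads(line) for line in result_lines if line.strip()]
batch_predictions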